Rank | Count | Beginning |
---|---|---|
4621 | 1152 | În |
1993 | 377 | Conform |
6210 | 266 | La |
4 | 247 | A |
3329 | 176 | După |
2739 | 160 | De |
3667 | 141 | El |
7629 | 129 | Pe |
3900 | 119 | Este |
3097 | 116 | Din |
50 | 104 | Aceasta |
2483 | 92 | Cu |
7271 | 91 | O |
8734 | 91 | Se |
9640 | 89 | Un |
7740 | 76 | Pentru |
3515 | 67 | Ea |
183 | 59 | Acest |
163 | 54 | Acesta |
1390 | 51 | Ca |
3003 | 50 | Deși |
6680 | 46 | Mai |
8132 | 45 | Prin |
1024 | 41 | Astfel, |
5898 | 41 | Între |
8573 | 40 | S-a |
7177 | 39 | Nu |
1128 | 38 | Au |
7527 | 38 | Până |
1441 | 37 | Când |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
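For readers without access to the database, the same count can be sketched in Python. This is a minimal illustration, not the code used to produce the table: the function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical. It assumes the sentences are available as plain strings and imitates `substring_index(sentence, ' ', n)` by splitting on single spaces.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Imitates substring_index(sentence, ' ', n): everything before the
    n-th space, or the whole sentence if it contains fewer than n spaces.
    """
    # split(" ") with an explicit separator keeps empty strings, so runs of
    # spaces are treated the same way as in the SQL expression.
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by count, descending (ORDER BY cnt DESC LIMIT limit).
    return counts.most_common(limit)


# Hypothetical toy data, not taken from the corpus.
sample = [
    "În anul 1990 a fost ales.",
    "În prezent locuiește acolo.",
    "Conform datelor, populația crește.",
    "În anul 2000 s-a mutat.",
]

print(sentence_beginnings(sample, n=1))
print(sentence_beginnings(sample, n=2))
```

Changing `n` to 2, 3 or 4 gives the longer beginnings covered in the following subsections. As in the SQL query, punctuation stays attached to the word it follows, which is why an entry such as "Astfel," appears with its comma in the table.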